Text categorization using automatically acquired domain ontology

Authors

Shih-Hung Wu

Richard Tzong-Han Tsai

Wen-Lian Hsu

Abstract

In this paper, we describe ontology-based text categorization in which the domain ontologies are automatically acquired through morphological rules and statistical methods. The ontology-based approach is a promising way for general information retrieval applications such as knowledge management or knowledge discovery. As a way to evaluate the quality of domain ontologies, we test our method through several experiments. Automatically acquired domain ontologies, with or without manual editing, have been used for text categorization. The results are quite satisfactory. Furthermore, we have developed an automatic method to evaluate the quality of our domain ontology.

Similar resources

Improving Hungarian Text Categorization Using Domain-Specific Ontology

The aim of Text Categorization is to automatically assign documents to a set of predefined categories. The prevailing approach is making use of a collection of precategorized examples for the induction of a document classifier through learning methods. In this paper we introduce a method which combines state-of-the-art learning techniques with background knowledge. We have used KAON ontology fo...

TelQAS: A Telecommunication Literature Question Answering System Benefits from a Text Categorization Mechanism

In this paper, we propose TeLQAS, an ontology-based natural language question answering system for the domain of Telecommunication Technologies. In an online process, the system accepts the users' questions in English and, after retrieving the related text documents from either its local database or the web, summarizes the retrieved text documents with the highest relevance. The pr...

Learning Ontologies to Improve Text Clustering and Classification

Recent work has shown improvements in text clustering and classification tasks by integrating conceptual features extracted from ontologies. In this paper we present text mining experiments in the medical domain in which the ontological structures used are acquired automatically in an unsupervised learning process from the text corpus in question. We compare results obtained using the automatic...

SEWISE: An Ontology-based Web Information Search Engine

Since the beginning of the 1990s, the World Wide Web (WWW) has rapidly guided the world into an amazing new electronic village, where everybody can publish everything in electronic form and find almost all required information. The volume of available information is increasing exponentially in different formats, 80% being text. It remains hard to find interesting information directly from Web sources. ...

Automatic Evaluation of Search Ontologies in the Entertainment Domain using Text Classification

Information Retrieval (IR) research has recently started addressing the information need of exploratory search, where the searcher may be unfamiliar with the domain or may not yet have decided on the goal of the query. A popular tool to support exploratory search is faceted search. The implementation of faceted search requires that documents be annotated by metadata in the form of attri...

Publication date: 2003
